S E M I N A R

 

Lazy Learning Techniques for Information Extraction

 

Rifat Ozcan
Ph.D. Student
Computer Engineering Department
Bilkent University

Information Extraction (IE) is the task of obtaining structured information from unstructured or semi-structured (e.g., HTML) text documents. Current state-of-the-art IE systems use several machine learning techniques, and many of them require human-tagged training examples to learn extraction patterns. An eager learning approach uses all available training examples to learn one or more extraction patterns before seeing any test data. Its learning phase is long, but its testing phase is fast. Incorporating newly obtained tagged data is a problem in this approach, since it requires learning from all training examples again. Lazy learning, by contrast, does no learning before seeing test data: for each test instance, it chooses a small subset of the training examples and learns a model for that instance. This study analyzes the effectiveness of lazy learning techniques for information extraction.
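
The lazy learning idea described above can be illustrated with a minimal Python sketch. The abstract does not say how the subset of training examples is chosen or what kind of model is learned from it, so everything here is an assumption made for illustration: the subset is the k training examples with the highest Jaccard token overlap with the test instance, the "model" is a simple majority vote over their labels, and the names jaccard, lazy_predict and the toy training data are hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two token collections (assumed similarity measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def lazy_predict(train, query_tokens, k=3):
    """Label one test instance without any up-front training.

    train: list of (context_tokens, label) pairs (hypothetical tagged examples).
    The k most similar training examples are selected for this query only,
    and a trivial local model (majority label) is built from them.
    """
    neighbours = sorted(
        train,
        key=lambda example: jaccard(example[0], query_tokens),
        reverse=True,
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical human-tagged examples: context tokens and the field they mark.
train = [
    (["date", ":", "may"], "DATE"),
    (["date", ":", "june"], "DATE"),
    (["place", ":", "room"], "PLACE"),
    (["place", ":", "hall"], "PLACE"),
    (["speaker", ":", "dr"], "SPEAKER"),
]

print(lazy_predict(train, ["date", ":", "monday"], k=3))

# Newly tagged data is simply appended; nothing has to be relearned.
train.append((["speaker", ":", "prof"], "SPEAKER"))
print(lazy_predict(train, ["speaker", ":", "mr"], k=2))
```

In this sketch all the work happens at test time, which mirrors the trade-off in the abstract: there is no learning phase and new tagged examples cost nothing to add, but every test instance pays for a pass over the training set.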

 

DATE: 14 May 2007, Monday @ 15:40
PLACE: EA 409